Overview

Dataset statistics

Number of variables: 25
Number of observations: 10302
Missing cells: 3004
Missing cells (%): 1.2%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 2.0 MiB
Average record size in memory: 200.0 B
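
The percentage figures above follow from the raw counts. A minimal sketch in plain Python, assuming the missing-cell percentage is taken over all cells (variables × observations) and the duplicate percentage over all rows. The variable names are illustrative only.

```python
# Counts copied from the Dataset statistics table above.
n_variables = 25
n_observations = 10302
missing_cells = 3004
duplicate_rows = 1

# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicates are measured against the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows below 0.1%: {duplicate_pct < 0.1}")
```

Under these assumptions the rounded results agree with the reported 1.2% and "< 0.1%".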

Variable types

Numeric: 9
Categorical: 13
Boolean: 3

Alerts

Dataset has 1 (< 0.1%) duplicate row [Duplicates]
INCOME has a high cardinality: 8151 distinct values [High cardinality]
HOME_VAL has a high cardinality: 6334 distinct values [High cardinality]
BLUEBOOK has a high cardinality: 2985 distinct values [High cardinality]
OLDCLAIM has a high cardinality: 3545 distinct values [High cardinality]
CLM_AMT has a high cardinality: 2346 distinct values [High cardinality]
AGE is highly overall correlated with HOMEKIDS [High correlation]
HOMEKIDS is highly overall correlated with AGE and 1 other field [High correlation]
PARENT1 is highly overall correlated with HOMEKIDS [High correlation]
GENDER is highly overall correlated with CAR_TYPE and 1 other field [High correlation]
EDUCATION is highly overall correlated with OCCUPATION [High correlation]
OCCUPATION is highly overall correlated with EDUCATION and 1 other field [High correlation]
CAR_USE is highly overall correlated with OCCUPATION and 1 other field [High correlation]
CAR_TYPE is highly overall correlated with GENDER and 1 other field [High correlation]
RED_CAR is highly overall correlated with GENDER [High correlation]
KIDSDRIV is highly imbalanced (71.1%) [Imbalance]
OLDCLAIM is highly imbalanced (53.1%) [Imbalance]
CLM_AMT is highly imbalanced (66.1%) [Imbalance]
YOJ has 548 (5.3%) missing values [Missing]
INCOME has 570 (5.5%) missing values [Missing]
HOME_VAL has 575 (5.6%) missing values [Missing]
OCCUPATION has 665 (6.5%) missing values [Missing]
CAR_AGE has 639 (6.2%) missing values [Missing]
HOMEKIDS has 6694 (65.0%) zeros [Zeros]
YOJ has 807 (7.8%) zeros [Zeros]
CLM_FREQ has 6292 (61.1%) zeros [Zeros]
MVR_PTS has 4658 (45.2%) zeros [Zeros]

Reproduction

Analysis started: 2023-04-29 09:07:42.508155
Analysis finished: 2023-04-29 09:08:00.515393
Duration: 18.01 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json

Variables

ID
Real number (ℝ)

Distinct: 8753
Distinct (%): 85.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.9566311 × 10^8
Minimum: 63175
Maximum: 9.9992637 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 80.6 KiB

Quantile statistics

Minimum: 63175
5-th percentile: 50696156
Q1: 2.4428686 × 10^8
median: 4.9700429 × 10^8
Q3: 7.3945507 × 10^8
95-th percentile: 9.4436522 × 10^8
Maximum: 9.9992637 × 10^8
Range: 9.9986319 × 10^8
Interquartile range (IQR): 4.9516821 × 10^8

Descriptive statistics

Standard deviation: 2.8646748 × 10^8
Coefficient of variation (CV): 0.57794795
Kurtosis: -1.1927241
Mean: 4.9566311 × 10^8
Median Absolute Deviation (MAD): 2.4643025 × 10^8
Skewness: 0.0050514277
Sum: 5.1063213 × 10^12
Variance: 8.2063617 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
341162899                 5   < 0.1%
747557690                 5   < 0.1%
173124759                 5   < 0.1%
632067262                 5   < 0.1%
750731752                 4   < 0.1%
303183248                 4   < 0.1%
132609655                 4   < 0.1%
59026158                  4   < 0.1%
983800811                 4   < 0.1%
22340563                  4   < 0.1%
Other values (8743)   10258   99.6%

Value                 Count   Frequency (%)
63175                     1   < 0.1%
246910                    1   < 0.1%
401276                    1   < 0.1%
813128                    2   < 0.1%
1307371                   2   < 0.1%
1514697                   1   < 0.1%
1541149                   1   < 0.1%
1627973                   1   < 0.1%
1780186                   1   < 0.1%
1860885                   1   < 0.1%

Value                 Count   Frequency (%)
999926368                 2   < 0.1%
999800537                 1   < 0.1%
999640290                 1   < 0.1%
999577084                 1   < 0.1%
999482663                 1   < 0.1%
999457398                 1   < 0.1%
999331839                 1   < 0.1%
999178959                 1   < 0.1%
999169190                 1   < 0.1%
999158340                 1   < 0.1%

KIDSDRIV
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 80.6 KiB
Value   Count
0        9069
1         804
2         351
3          74
4           4

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 10302
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0        9069   88.0%
1         804   7.8%
2         351   3.4%
3          74   0.7%
4           4   < 0.1%
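
The Overview flags KIDSDRIV as "highly imbalanced (71.1%)". The report does not state how that figure is computed. The sketch below assumes it is one minus the Shannon entropy of the class frequencies, normalised by the logarithm of the number of classes; this assumption is chosen because it reproduces the reported value from the counts above. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the
    class frequencies and k is the number of non-empty classes.

    0 means perfectly balanced classes; values near 1 mean that one
    class dominates. The formula is an assumption, not taken from
    the report itself.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # hypothetical convention for a single-class column
    entropy = -sum(p * math.log(p) for p in probs)
    return 1 - entropy / math.log(len(probs))


# Counts for KIDSDRIV values 0..4, copied from the Common Values table.
kidsdriv_counts = [9069, 804, 351, 74, 4]
print(f"KIDSDRIV imbalance: {100 * imbalance_score(kidsdriv_counts):.1f}%")
```

With these counts the score comes out at about 0.711, in line with the alert.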

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 9069
88.0%
1 804
 
7.8%
2 351
 
3.4%
3 74
 
0.7%
4 4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 9069
88.0%
1 804
 
7.8%
2 351
 
3.4%
3 74
 
0.7%
4 4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10302
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 9069
88.0%
1 804
 
7.8%
2 351
 
3.4%
3 74
 
0.7%
4 4
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 10302
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 9069
88.0%
1 804
 
7.8%
2 351
 
3.4%
3 74
 
0.7%
4 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10302
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 9069
88.0%
1 804
 
7.8%
2 351
 
3.4%
3 74
 
0.7%
4 4
 
< 0.1%

AGE
Real number (ℝ)

Distinct61
Distinct (%)0.6%
Missing7
Missing (%)0.1%
Infinite0
Infinite (%)0.0%
Mean44.837397
Minimum16
Maximum81
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size80.6 KiB

Quantile statistics

Minimum16
5-th percentile30
Q139
median45
Q351
95-th percentile59
Maximum81
Range65
Interquartile range (IQR)12

Descriptive statistics

Standard deviation8.606445
Coefficient of variation (CV)0.19194792
Kurtosis-0.080902596
Mean44.837397
Median Absolute Deviation (MAD)6
Skewness-0.034540655
Sum461601
Variance74.070896
MonotonicityNot monotonic
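
Several of the AGE statistics above are derived from others. A short sketch that recomputes them from the reported values, assuming the mean and sum are taken over the non-missing rows only (10302 observations minus 7 missing):

```python
# Values copied from the AGE section of the report.
minimum, maximum = 16, 81
q1, q3 = 39, 51
mean = 44.837397
std_dev = 8.606445
n_observations, n_missing = 10302, 7

value_range = maximum - minimum   # reported Range: 65
iqr = q3 - q1                     # reported IQR: 12
cv = std_dev / mean               # reported CV: 0.19194792
variance = std_dev ** 2           # reported Variance: 74.070896

# Assumption: missing rows are excluded before averaging.
n_valid = n_observations - n_missing
approx_sum = mean * n_valid       # reported Sum: 461601

print(value_range, iqr, round(cv, 6), round(variance, 4), round(approx_sum))
```

The recomputed sum matches the reported one only when the 7 missing rows are left out, which supports the assumption.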
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
46 496
 
4.8%
45 488
 
4.7%
48 464
 
4.5%
47 451
 
4.4%
43 441
 
4.3%
41 429
 
4.2%
50 424
 
4.1%
44 423
 
4.1%
40 406
 
3.9%
42 404
 
3.9%
Other values (51) 5869
57.0%
ValueCountFrequency (%)
16 5
 
< 0.1%
17 2
 
< 0.1%
18 3
 
< 0.1%
19 8
 
0.1%
20 4
 
< 0.1%
21 12
 
0.1%
22 17
0.2%
23 12
 
0.1%
24 25
0.2%
25 32
0.3%
ValueCountFrequency (%)
81 1
 
< 0.1%
80 1
 
< 0.1%
76 1
 
< 0.1%
73 4
 
< 0.1%
72 4
 
< 0.1%
71 1
 
< 0.1%
70 6
 
0.1%
69 5
 
< 0.1%
68 8
0.1%
67 16
0.2%

HOMEKIDS
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct6
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.72044263
Minimum0
Maximum5
Zeros6694
Zeros (%)65.0%
Negative0
Negative (%)0.0%
Memory size80.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile3
Maximum5
Range5
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.1163232
Coefficient of variation (CV)1.5494963
Kurtosis0.6293464
Mean0.72044263
Median Absolute Deviation (MAD)0
Skewness1.3366776
Sum7422
Variance1.2461775
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
0 6694
65.0%
2 1427
 
13.9%
1 1106
 
10.7%
3 856
 
8.3%
4 201
 
2.0%
5 18
 
0.2%
ValueCountFrequency (%)
0 6694
65.0%
1 1106
 
10.7%
2 1427
 
13.9%
3 856
 
8.3%
4 201
 
2.0%
5 18
 
0.2%
ValueCountFrequency (%)
5 18
 
0.2%
4 201
 
2.0%
3 856
 
8.3%
2 1427
 
13.9%
1 1106
 
10.7%
0 6694
65.0%
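
Because HOMEKIDS takes only six values, its summary statistics can be recomputed directly from the frequency table. A minimal sketch, assuming the reported variance uses the sample convention (divisor n - 1):

```python
# Frequency table for HOMEKIDS, copied from the report: value -> count.
homekids_counts = {0: 6694, 1: 1106, 2: 1427, 3: 856, 4: 201, 5: 18}

n = sum(homekids_counts.values())
total = sum(value * count for value, count in homekids_counts.items())
mean = total / n

# Sample variance (divisor n - 1); an assumption about the convention
# behind the reported figure.
squared_dev = sum(count * (value - mean) ** 2
                  for value, count in homekids_counts.items())
variance = squared_dev / (n - 1)

# Share of rows with no children at home.
zeros_pct = 100 * homekids_counts[0] / n

print(n, total, round(mean, 6), round(variance, 5), round(zeros_pct, 1))
```

The results agree with the reported Sum (7422), Mean (0.72044263), Variance (1.2461775) and Zeros (65.0%).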

YOJ
Real number (ℝ)

MISSING  ZEROS 

Distinct21
Distinct (%)0.2%
Missing548
Missing (%)5.3%
Infinite0
Infinite (%)0.0%
Mean10.474062
Minimum0
Maximum23
Zeros807
Zeros (%)7.8%
Negative0
Negative (%)0.0%
Memory size80.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q19
median11
Q313
95-th percentile15
Maximum23
Range23
Interquartile range (IQR)4

Descriptive statistics

Standard deviation4.1089432
Coefficient of variation (CV)0.39229701
Kurtosis1.1448021
Mean10.474062
Median Absolute Deviation (MAD)2
Skewness-1.2008229
Sum102164
Variance16.883414
MonotonicityNot monotonic
Histogram with fixed size bins (bins=21)
ValueCountFrequency (%)
12 1500
14.6%
11 1267
12.3%
13 1266
12.3%
14 996
9.7%
10 934
9.1%
0 807
7.8%
9 653
6.3%
15 583
 
5.7%
8 484
 
4.7%
7 384
 
3.7%
Other values (11) 880
8.5%
(Missing) 548
 
5.3%
ValueCountFrequency (%)
0 807
7.8%
1 7
 
0.1%
2 21
 
0.2%
3 38
 
0.4%
4 49
 
0.5%
5 124
 
1.2%
6 219
 
2.1%
7 384
3.7%
8 484
4.7%
9 653
6.3%
ValueCountFrequency (%)
23 2
 
< 0.1%
19 17
 
0.2%
18 33
 
0.3%
17 127
 
1.2%
16 243
 
2.4%
15 583
 
5.7%
14 996
9.7%
13 1266
12.3%
12 1500
14.6%
11 1267
12.3%

INCOME
Categorical

HIGH CARDINALITY  MISSING 

Distinct8151
Distinct (%)83.8%
Missing570
Missing (%)5.5%
Memory size80.6 KiB
$0
 
797
$61,790
 
5
$64,916
 
4
$48,509
 
4
$30,111
 
4
Other values (8146)
8918 

Length

Max length8
Median length7
Mean length6.714961
Min length2

Characters and Unicode

Total characters65350
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique7443 ?
Unique (%)76.5%

Sample

1st row$67,349
2nd row$91,449
3rd row$52,881
4th row$16,039
5th row$114,986
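
INCOME is typed as Categorical with high cardinality, most likely because the values are stored as strings with a dollar sign and thousands separators (the 12 distinct characters would be the ten digits plus "$" and ","). A minimal sketch of converting such strings to numbers before re-profiling; `parse_currency` is a hypothetical helper, and the same cleaning would presumably apply to HOME_VAL and BLUEBOOK.

```python
from typing import Optional


def parse_currency(text: Optional[str]) -> Optional[float]:
    """Convert a string such as "$67,349" to 67349.0.

    Missing values (None or empty strings) are passed through as None.
    """
    if text is None:
        return None
    cleaned = text.strip().replace("$", "").replace(",", "")
    return float(cleaned) if cleaned else None


# The five sample rows shown above, plus "$0" and a missing value.
samples = ["$67,349", "$91,449", "$52,881", "$16,039", "$114,986", "$0", None]
parsed = [parse_currency(s) for s in samples]
print(parsed)
```

Once parsed, the column could be profiled as a real number, with "$0" counted under Zeros rather than as a category.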

Common Values

ValueCountFrequency (%)
$0 797
 
7.7%
$61,790 5
 
< 0.1%
$64,916 4
 
< 0.1%
$48,509 4
 
< 0.1%
$30,111 4
 
< 0.1%
$43,393 4
 
< 0.1%
$26,840 4
 
< 0.1%
$38,290 3
 
< 0.1%
$2,346 3
 
< 0.1%
$82,398 3
 
< 0.1%
Other values (8141) 8901
86.4%
(Missing) 570
 
5.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0 797
 
8.2%
61,790 5
 
0.1%
64,916 4
 
< 0.1%
48,509 4
 
< 0.1%
30,111 4
 
< 0.1%
43,393 4
 
< 0.1%
26,840 4
 
< 0.1%
107,375 3
 
< 0.1%
47,513 3
 
< 0.1%
19,599 3
 
< 0.1%
Other values (8141) 8901
91.5%
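
The same count of 797 zero-income rows appears as 7.7% in the Common Values table and as 8.2% in the table directly above. The difference looks like a change of denominator: all rows in one case, non-missing rows in the other. A sketch under that assumption:

```python
n_rows = 10302
n_missing = 570
zero_income_count = 797   # rows with the value "$0"
other_count = 8901        # "Other values (8141)"
distinct_count = 8151

n_present = n_rows - n_missing

# Share of all rows (Common Values table).
pct_of_all = 100 * zero_income_count / n_rows
# Share of non-missing rows (table directly above).
pct_of_present = 100 * zero_income_count / n_present

# Distinct (%) also appears to use the non-missing denominator.
distinct_pct = 100 * distinct_count / n_present

print(f"{pct_of_all:.1f}% of all rows, {pct_of_present:.1f}% of non-missing rows")
print(f"Distinct: {distinct_pct:.1f}%")
```

The "Other values" row behaves the same way: 8901 is 86.4% of all rows and 91.5% of non-missing rows.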

Most occurring characters

ValueCountFrequency (%)
$ 9732
14.9%
, 8884
13.6%
1 6016
9.2%
2 4914
7.5%
3 4851
7.4%
0 4799
7.3%
4 4642
7.1%
5 4597
7.0%
6 4531
6.9%
7 4252
6.5%
Other values (2) 8132
12.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 46734
71.5%
Currency Symbol 9732
 
14.9%
Other Punctuation 8884
 
13.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 6016
12.9%
2 4914
10.5%
3 4851
10.4%
0 4799
10.3%
4 4642
9.9%
5 4597
9.8%
6 4531
9.7%
7 4252
9.1%
9 4073
8.7%
8 4059
8.7%
Currency Symbol
ValueCountFrequency (%)
$ 9732
100.0%
Other Punctuation
ValueCountFrequency (%)
, 8884
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 65350
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
$ 9732
14.9%
, 8884
13.6%
1 6016
9.2%
2 4914
7.5%
3 4851
7.4%
0 4799
7.3%
4 4642
7.1%
5 4597
7.0%
6 4531
6.9%
7 4252
6.5%
Other values (2) 8132
12.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 65350
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
$ 9732
14.9%
, 8884
13.6%
1 6016
9.2%
2 4914
7.5%
3 4851
7.4%
0 4799
7.3%
4 4642
7.1%
5 4597
7.0%
6 4531
6.9%
7 4252
6.5%
Other values (2) 8132
12.4%

PARENT1
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.2 KiB
False
8959 
True
1343 
ValueCountFrequency (%)
False 8959
87.0%
True 1343
 
13.0%

HOME_VAL
Categorical

HIGH CARDINALITY  MISSING 

Distinct6334
Distinct (%)65.1%
Missing575
Missing (%)5.6%
Memory size80.6 KiB
$0
2908 
$151,286
 
3
$214,584
 
3
$159,568
 
3
$167,505
 
3
Other values (6329)
6807 

Length

Max length8
Median length8
Mean length6.1581166
Min length2

Characters and Unicode

Total characters59900
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique5883 ?
Unique (%)60.5%

Sample

1st row$0
2nd row$257,252
3rd row$0
4th row$124,191
5th row$306,251

Common Values

ValueCountFrequency (%)
$0 2908
28.2%
$151,286 3
 
< 0.1%
$214,584 3
 
< 0.1%
$159,568 3
 
< 0.1%
$167,505 3
 
< 0.1%
$99,103 3
 
< 0.1%
$332,673 3
 
< 0.1%
$166,481 3
 
< 0.1%
$178,852 3
 
< 0.1%
$165,641 3
 
< 0.1%
Other values (6324) 6792
65.9%
(Missing) 575
 
5.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0 2908
29.9%
225,111 3
 
< 0.1%
121,949 3
 
< 0.1%
153,061 3
 
< 0.1%
117,038 3
 
< 0.1%
196,320 3
 
< 0.1%
176,219 3
 
< 0.1%
154,672 3
 
< 0.1%
244,764 3
 
< 0.1%
288,592 3
 
< 0.1%
Other values (6324) 6792
69.8%

Most occurring characters

ValueCountFrequency (%)
$ 9727
16.2%
, 6819
11.4%
0 6361
10.6%
1 6141
10.3%
2 5872
9.8%
3 4193
7.0%
4 3591
 
6.0%
5 3496
 
5.8%
8 3447
 
5.8%
6 3432
 
5.7%
Other values (2) 6821
11.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 43354
72.4%
Currency Symbol 9727
 
16.2%
Other Punctuation 6819
 
11.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 6361
14.7%
1 6141
14.2%
2 5872
13.5%
3 4193
9.7%
4 3591
8.3%
5 3496
8.1%
8 3447
8.0%
6 3432
7.9%
9 3426
7.9%
7 3395
7.8%
Currency Symbol
ValueCountFrequency (%)
$ 9727
100.0%
Other Punctuation
ValueCountFrequency (%)
, 6819
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 59900
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
$ 9727
16.2%
, 6819
11.4%
0 6361
10.6%
1 6141
10.3%
2 5872
9.8%
3 4193
7.0%
4 3591
 
6.0%
5 3496
 
5.8%
8 3447
 
5.8%
6 3432
 
5.7%
Other values (2) 6821
11.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 59900
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
$ 9727
16.2%
, 6819
11.4%
0 6361
10.6%
1 6141
10.3%
2 5872
9.8%
3 4193
7.0%
4 3591
 
6.0%
5 3496
 
5.8%
8 3447
 
5.8%
6 3432
 
5.7%
Other values (2) 6821
11.4%

MSTATUS
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
Yes
6188 
z_No
4114 

Length

Max length4
Median length3
Mean length3.3993399
Min length3

Characters and Unicode

Total characters35020
Distinct characters7
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowz_No
2nd rowz_No
3rd rowz_No
4th rowYes
5th rowYes

Common Values

ValueCountFrequency (%)
Yes 6188
60.1%
z_No 4114
39.9%
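
Several category labels in this report carry a "z_" prefix ("z_No" here, and "z_F", "z_High School", "z_Blue Collar" and "z_SUV" in other variables). Assuming the prefix is only a sort-order artifact with no meaning of its own, it can be stripped before modelling. A sketch with a hypothetical helper, `strip_z_prefix`:

```python
def strip_z_prefix(label: str) -> str:
    """Drop a leading "z_" from a category label, if present."""
    return label[2:] if label.startswith("z_") else label


# Labels as they appear in this report.
labels = ["Yes", "z_No", "z_F", "M", "z_High School", "z_Blue Collar", "z_SUV"]
cleaned = [strip_z_prefix(label) for label in labels]
print(cleaned)
```

Labels without the prefix pass through unchanged, so the helper can be applied to a whole column.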

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
yes 6188
60.1%
z_no 4114
39.9%

Most occurring characters

ValueCountFrequency (%)
Y 6188
17.7%
e 6188
17.7%
s 6188
17.7%
z 4114
11.7%
_ 4114
11.7%
N 4114
11.7%
o 4114
11.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20604
58.8%
Uppercase Letter 10302
29.4%
Connector Punctuation 4114
 
11.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 6188
30.0%
s 6188
30.0%
z 4114
20.0%
o 4114
20.0%
Uppercase Letter
ValueCountFrequency (%)
Y 6188
60.1%
N 4114
39.9%
Connector Punctuation
ValueCountFrequency (%)
_ 4114
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30906
88.3%
Common 4114
 
11.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 6188
20.0%
e 6188
20.0%
s 6188
20.0%
z 4114
13.3%
N 4114
13.3%
o 4114
13.3%
Common
ValueCountFrequency (%)
_ 4114
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35020
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 6188
17.7%
e 6188
17.7%
s 6188
17.7%
z 4114
11.7%
_ 4114
11.7%
N 4114
11.7%
o 4114
11.7%

GENDER
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
z_F
5545 
M
4757 

Length

Max length3
Median length3
Mean length2.07649
Min length1

Characters and Unicode

Total characters21392
Distinct characters4
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowM
2nd rowM
3rd rowM
4th rowz_F
5th rowM

Common Values

ValueCountFrequency (%)
z_F 5545
53.8%
M 4757
46.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
z_f 5545
53.8%
m 4757
46.2%

Most occurring characters

ValueCountFrequency (%)
z 5545
25.9%
_ 5545
25.9%
F 5545
25.9%
M 4757
22.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 10302
48.2%
Lowercase Letter 5545
25.9%
Connector Punctuation 5545
25.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F 5545
53.8%
M 4757
46.2%
Lowercase Letter
ValueCountFrequency (%)
z 5545
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 5545
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15847
74.1%
Common 5545
 
25.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
z 5545
35.0%
F 5545
35.0%
M 4757
30.0%
Common
ValueCountFrequency (%)
_ 5545
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21392
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
z 5545
25.9%
_ 5545
25.9%
F 5545
25.9%
M 4757
22.2%

EDUCATION
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
z_High School
2952 
Bachelors
2823 
Masters
2078 
<High School
1515 
PhD
934 

Length

Max length13
Median length12
Mean length9.6399728
Min length3

Characters and Unicode

Total characters99311
Distinct characters21
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPhD
2nd rowz_High School
3rd rowBachelors
4th rowz_High School
5th row<High School

Common Values

ValueCountFrequency (%)
z_High School 2952
28.7%
Bachelors 2823
27.4%
Masters 2078
20.2%
<High School 1515
14.7%
PhD 934
 
9.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
school 4467
30.2%
z_high 2952
20.0%
bachelors 2823
19.1%
masters 2078
14.1%
high 1515
 
10.3%
phd 934
 
6.3%

Most occurring characters

ValueCountFrequency (%)
h 12691
12.8%
o 11757
 
11.8%
l 7290
 
7.3%
c 7290
 
7.3%
s 6979
 
7.0%
e 4901
 
4.9%
r 4901
 
4.9%
a 4901
 
4.9%
H 4467
 
4.5%
i 4467
 
4.5%
Other values (11) 29667
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 74674
75.2%
Uppercase Letter 15703
 
15.8%
Space Separator 4467
 
4.5%
Connector Punctuation 2952
 
3.0%
Math Symbol 1515
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
h 12691
17.0%
o 11757
15.7%
l 7290
9.8%
c 7290
9.8%
s 6979
9.3%
e 4901
 
6.6%
r 4901
 
6.6%
a 4901
 
6.6%
i 4467
 
6.0%
g 4467
 
6.0%
Other values (2) 5030
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
H 4467
28.4%
S 4467
28.4%
B 2823
18.0%
M 2078
13.2%
P 934
 
5.9%
D 934
 
5.9%
Space Separator
ValueCountFrequency (%)
4467
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2952
100.0%
Math Symbol
ValueCountFrequency (%)
< 1515
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 90377
91.0%
Common 8934
 
9.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
h 12691
14.0%
o 11757
13.0%
l 7290
 
8.1%
c 7290
 
8.1%
s 6979
 
7.7%
e 4901
 
5.4%
r 4901
 
5.4%
a 4901
 
5.4%
H 4467
 
4.9%
i 4467
 
4.9%
Other values (8) 20733
22.9%
Common
ValueCountFrequency (%)
4467
50.0%
_ 2952
33.0%
< 1515
 
17.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 99311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
h 12691
12.8%
o 11757
 
11.8%
l 7290
 
7.3%
c 7290
 
7.3%
s 6979
 
7.0%
e 4901
 
4.9%
r 4901
 
4.9%
a 4901
 
4.9%
H 4467
 
4.5%
i 4467
 
4.5%
Other values (11) 29667
29.9%

OCCUPATION
Categorical

HIGH CORRELATION  MISSING 

Distinct8
Distinct (%)0.1%
Missing665
Missing (%)6.5%
Memory size80.6 KiB
z_Blue Collar
2288 
Clerical
1590 
Professional
1408 
Manager
1257 
Lawyer
1031 
Other values (3)
2063 

Length

Max length13
Median length10
Mean length9.44215
Min length6

Characters and Unicode

Total characters90994
Distinct characters29
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowProfessional
2nd rowz_Blue Collar
3rd rowManager
4th rowClerical
5th rowz_Blue Collar

Common Values

ValueCountFrequency (%)
z_Blue Collar 2288
22.2%
Clerical 1590
15.4%
Professional 1408
13.7%
Manager 1257
12.2%
Lawyer 1031
10.0%
Student 899
 
8.7%
Home Maker 843
 
8.2%
Doctor 321
 
3.1%
(Missing) 665
 
6.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
z_blue 2288
17.9%
collar 2288
17.9%
clerical 1590
12.5%
professional 1408
11.0%
manager 1257
9.8%
lawyer 1031
8.1%
student 899
 
7.0%
home 843
 
6.6%
maker 843
 
6.6%
doctor 321
 
2.5%

Most occurring characters

ValueCountFrequency (%)
l 11452
12.6%
e 10159
 
11.2%
a 9674
 
10.6%
r 8738
 
9.6%
o 6589
 
7.2%
C 3878
 
4.3%
n 3564
 
3.9%
u 3187
 
3.5%
3131
 
3.4%
i 2998
 
3.3%
Other values (19) 27624
30.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 72807
80.0%
Uppercase Letter 12768
 
14.0%
Space Separator 3131
 
3.4%
Connector Punctuation 2288
 
2.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 11452
15.7%
e 10159
14.0%
a 9674
13.3%
r 8738
12.0%
o 6589
9.0%
n 3564
 
4.9%
u 3187
 
4.4%
i 2998
 
4.1%
s 2816
 
3.9%
z 2288
 
3.1%
Other values (9) 11342
15.6%
Uppercase Letter
ValueCountFrequency (%)
C 3878
30.4%
B 2288
17.9%
M 2100
16.4%
P 1408
 
11.0%
L 1031
 
8.1%
S 899
 
7.0%
H 843
 
6.6%
D 321
 
2.5%
Space Separator
ValueCountFrequency (%)
3131
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2288
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 85575
94.0%
Common 5419
 
6.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 11452
13.4%
e 10159
11.9%
a 9674
11.3%
r 8738
 
10.2%
o 6589
 
7.7%
C 3878
 
4.5%
n 3564
 
4.2%
u 3187
 
3.7%
i 2998
 
3.5%
s 2816
 
3.3%
Other values (17) 22520
26.3%
Common
ValueCountFrequency (%)
3131
57.8%
_ 2288
42.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 90994
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 11452
12.6%
e 10159
 
11.2%
a 9674
 
10.6%
r 8738
 
9.6%
o 6589
 
7.2%
C 3878
 
4.3%
n 3564
 
3.9%
u 3187
 
3.5%
3131
 
3.4%
i 2998
 
3.3%
Other values (19) 27624
30.4%

TRAVTIME
Real number (ℝ)

Distinct100
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean33.416424
Minimum5
Maximum142
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size80.6 KiB

Quantile statistics

Minimum5
5-th percentile7
Q122
median33
Q344
95-th percentile60
Maximum142
Range137
Interquartile range (IQR)22

Descriptive statistics

Standard deviation15.869687
Coefficient of variation (CV)0.4749068
Kurtosis0.59465625
Mean33.416424
Median Absolute Deviation (MAD)11
Skewness0.43552716
Sum344256
Variance251.84696
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
5 427
 
4.1%
32 288
 
2.8%
35 271
 
2.6%
33 268
 
2.6%
36 266
 
2.6%
30 265
 
2.6%
37 264
 
2.6%
29 259
 
2.5%
25 257
 
2.5%
24 253
 
2.5%
Other values (90) 7484
72.6%
ValueCountFrequency (%)
5 427
4.1%
6 66
 
0.6%
7 56
 
0.5%
8 69
 
0.7%
9 86
 
0.8%
10 101
 
1.0%
11 90
 
0.9%
12 122
 
1.2%
13 121
 
1.2%
14 133
 
1.3%
ValueCountFrequency (%)
142 1
< 0.1%
134 1
< 0.1%
124 1
< 0.1%
113 1
< 0.1%
105 1
< 0.1%
103 1
< 0.1%
101 1
< 0.1%
99 1
< 0.1%
98 1
< 0.1%
97 2
< 0.1%

CAR_USE
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
Private
6513 
Commercial
3789 

Length

Max length10
Median length7
Mean length8.103378
Min length7

Characters and Unicode

Total characters83481
Distinct characters12
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPrivate
2nd rowCommercial
3rd rowPrivate
4th rowPrivate
5th rowPrivate

Common Values

ValueCountFrequency (%)
Private 6513
63.2%
Commercial 3789
36.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
private 6513
63.2%
commercial 3789
36.8%

Most occurring characters

ValueCountFrequency (%)
r 10302
12.3%
i 10302
12.3%
a 10302
12.3%
e 10302
12.3%
m 7578
9.1%
P 6513
7.8%
v 6513
7.8%
t 6513
7.8%
C 3789
 
4.5%
o 3789
 
4.5%
Other values (2) 7578
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 73179
87.7%
Uppercase Letter 10302
 
12.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 10302
14.1%
i 10302
14.1%
a 10302
14.1%
e 10302
14.1%
m 7578
10.4%
v 6513
8.9%
t 6513
8.9%
o 3789
 
5.2%
c 3789
 
5.2%
l 3789
 
5.2%
Uppercase Letter
ValueCountFrequency (%)
P 6513
63.2%
C 3789
36.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 83481
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 10302
12.3%
i 10302
12.3%
a 10302
12.3%
e 10302
12.3%
m 7578
9.1%
P 6513
7.8%
v 6513
7.8%
t 6513
7.8%
C 3789
 
4.5%
o 3789
 
4.5%
Other values (2) 7578
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 83481
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 10302
12.3%
i 10302
12.3%
a 10302
12.3%
e 10302
12.3%
m 7578
9.1%
P 6513
7.8%
v 6513
7.8%
t 6513
7.8%
C 3789
 
4.5%
o 3789
 
4.5%
Other values (2) 7578
9.1%

BLUEBOOK
Categorical

Distinct2985
Distinct (%)29.0%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
$1,500
 
207
$6,200
 
47
$6,000
 
42
$5,800
 
39
$5,400
 
38
Other values (2980)
9929 

Length

Max length7
Median length7
Mean length6.7128713
Min length6

Characters and Unicode

Total characters69156
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique834 ?
Unique (%)8.1%

Sample

1st row$14,230
2nd row$14,940
3rd row$21,970
4th row$4,010
5th row$15,440

Common Values

ValueCountFrequency (%)
$1,500 207
 
2.0%
$6,200 47
 
0.5%
$6,000 42
 
0.4%
$5,800 39
 
0.4%
$5,400 38
 
0.4%
$5,600 38
 
0.4%
$5,900 37
 
0.4%
$5,700 36
 
0.3%
$6,500 36
 
0.3%
$6,400 35
 
0.3%
Other values (2975) 9747
94.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
1,500 207
 
2.0%
6,200 47
 
0.5%
6,000 42
 
0.4%
5,800 39
 
0.4%
5,400 38
 
0.4%
5,600 38
 
0.4%
5,900 37
 
0.4%
5,700 36
 
0.3%
6,500 36
 
0.3%
6,100 35
 
0.3%
Other values (2975) 9747
94.6%

Most occurring characters

ValueCountFrequency (%)
0 14152
20.5%
$ 10302
14.9%
, 10302
14.9%
1 7572
10.9%
2 5057
 
7.3%
3 3419
 
4.9%
5 3323
 
4.8%
6 3148
 
4.6%
4 3059
 
4.4%
7 3047
 
4.4%
Other values (2) 5775
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 48552
70.2%
Currency Symbol 10302
 
14.9%
Other Punctuation 10302
 
14.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 14152
29.1%
1 7572
15.6%
2 5057
 
10.4%
3 3419
 
7.0%
5 3323
 
6.8%
6 3148
 
6.5%
4 3059
 
6.3%
7 3047
 
6.3%
8 2932
 
6.0%
9 2843
 
5.9%
Currency Symbol
ValueCountFrequency (%)
$ 10302
100.0%
Other Punctuation
ValueCountFrequency (%)
, 10302
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 69156
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 14152
20.5%
$ 10302
14.9%
, 10302
14.9%
1 7572
10.9%
2 5057
 
7.3%
3 3419
 
4.9%
5 3323
 
4.8%
6 3148
 
4.6%
4 3059
 
4.4%
7 3047
 
4.4%
Other values (2) 5775
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 69156
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 14152
20.5%
$ 10302
14.9%
, 10302
14.9%
1 7572
10.9%
2 5057
 
7.3%
3 3419
 
4.9%
5 3323
 
4.8%
6 3148
 
4.6%
4 3059
 
4.4%
7 3047
 
4.4%
Other values (2) 5775
8.4%

TIF
Real number (ℝ)

Distinct23
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.3291594
Minimum1
Maximum25
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size80.6 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median4
Q37
95-th percentile13
Maximum25
Range24
Interquartile range (IQR)6

Descriptive statistics

Standard deviation4.1107947
Coefficient of variation (CV)0.7713777
Kurtosis0.47970857
Mean5.3291594
Median Absolute Deviation (MAD)3
Skewness0.89941526
Sum54901
Variance16.898633
MonotonicityNot monotonic
Histogram with fixed size bins (bins=23)
ValueCountFrequency (%)
1 3172
30.8%
6 1707
16.6%
4 1616
15.7%
10 951
 
9.2%
7 781
 
7.6%
3 531
 
5.2%
13 355
 
3.4%
11 300
 
2.9%
9 299
 
2.9%
17 126
 
1.2%
Other values (13) 464
 
4.5%
ValueCountFrequency (%)
1 3172
30.8%
2 6
 
0.1%
3 531
 
5.2%
4 1616
15.7%
5 70
 
0.7%
6 1707
16.6%
7 781
 
7.6%
8 83
 
0.8%
9 299
 
2.9%
10 951
 
9.2%
ValueCountFrequency (%)
25 3
 
< 0.1%
22 3
 
< 0.1%
21 13
 
0.1%
20 12
 
0.1%
19 11
 
0.1%
18 26
 
0.3%
17 126
1.2%
16 50
 
0.5%
15 40
 
0.4%
14 92
0.9%

CAR_TYPE
Categorical

Distinct6
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size80.6 KiB
z_SUV
2883 
Minivan
2694 
Pickup
1772 
Sports Car
1179 
Van
921 

Length

Max length11
Median length10
Mean length6.5852262
Min length3

Characters and Unicode

Total characters67841
Distinct characters24
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMinivan
2nd rowMinivan
3rd rowVan
4th rowz_SUV
5th rowMinivan

Common Values

ValueCountFrequency (%)
z_SUV 2883
28.0%
Minivan 2694
26.2%
Pickup 1772
17.2%
Sports Car 1179
11.4%
Van 921
 
8.9%
Panel Truck 853
 
8.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
z_suv | 2883 | 23.4%
minivan | 2694 | 21.8%
pickup | 1772 | 14.4%
sports | 1179 | 9.6%
car | 1179 | 9.6%
van | 921 | 7.5%
panel | 853 | 6.9%
truck | 853 | 6.9%
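The word-level counts above appear to come from lowercasing each category and splitting it on whitespace, so "Sports Car" contributes to both "sports" and "car". That rule is inferred from the numbers rather than stated by the report; the sketch below reproduces the table under that assumption, with `word_frequencies` as a hypothetical helper name.

```python
from collections import Counter

# Category counts copied from the CAR_TYPE Common Values table.
category_counts = {
    "z_SUV": 2883,
    "Minivan": 2694,
    "Pickup": 1772,
    "Sports Car": 1179,
    "Van": 921,
    "Panel Truck": 853,
}


def word_frequencies(counts):
    """Lowercase each category, split on whitespace, and sum counts per word."""
    words = Counter()
    for category, n in counts.items():
        for word in category.lower().split():
            words[word] += n
    return words


words = word_frequencies(category_counts)
total = sum(words.values())  # total number of words, not rows
for word, n in words.most_common():
    print(f"{word:8s} {n:5d} {100 * n / total:5.1f}%")
```

Because two-word categories are counted twice, the percentages are taken over the total number of words rather than the number of rows, which is why z_suv shows 23.4% here but 28.0% in the Common Values table.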

Most occurring characters

ValueCountFrequency (%)
n 7162
 
10.6%
i 7160
 
10.6%
a 5647
 
8.3%
S 4062
 
6.0%
V 3804
 
5.6%
r 3211
 
4.7%
p 2951
 
4.3%
z 2883
 
4.2%
U 2883
 
4.2%
_ 2883
 
4.2%
Other values (14) 25195
37.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 44826
66.1%
Uppercase Letter 18100
26.7%
Connector Punctuation 2883
 
4.2%
Space Separator 2032
 
3.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 7162
16.0%
i 7160
16.0%
a 5647
12.6%
r 3211
7.2%
p 2951
6.6%
z 2883
6.4%
v 2694
 
6.0%
u 2625
 
5.9%
k 2625
 
5.9%
c 2625
 
5.9%
Other values (5) 5243
11.7%
Uppercase Letter
ValueCountFrequency (%)
S 4062
22.4%
V 3804
21.0%
U 2883
15.9%
M 2694
14.9%
P 2625
14.5%
C 1179
 
6.5%
T 853
 
4.7%
Connector Punctuation
ValueCountFrequency (%)
_ 2883
100.0%
Space Separator
ValueCountFrequency (%)
2032
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 62926
92.8%
Common 4915
 
7.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 7162
 
11.4%
i 7160
 
11.4%
a 5647
 
9.0%
S 4062
 
6.5%
V 3804
 
6.0%
r 3211
 
5.1%
p 2951
 
4.7%
z 2883
 
4.6%
U 2883
 
4.6%
M 2694
 
4.3%
Other values (12) 20469
32.5%
Common
ValueCountFrequency (%)
_ 2883
58.7%
2032
41.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 67841
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 7162
 
10.6%
i 7160
 
10.6%
a 5647
 
8.3%
S 4062
 
6.0%
V 3804
 
5.6%
r 3211
 
4.7%
p 2951
 
4.3%
z 2883
 
4.2%
U 2883
 
4.2%
_ 2883
 
4.2%
Other values (14) 25195
37.1%

RED_CAR
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.2 KiB

Value | Count | Frequency (%)
False | 7326 | 71.1%
True | 2976 | 28.9%

OLDCLAIM
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 3545
Distinct (%): 34.4%
Missing: 0
Missing (%): 0.0%
Memory size: 80.6 KiB

$0: 6292
$1,310: 4
$4,448: 4
$1,391: 4
$4,188: 4
Other values (3540): 3994

Length

Max length7
Median length2
Mean length3.6257037
Min length2

Characters and Unicode

Total characters37352
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3130 ?
Unique (%)30.4%

Sample

1st row$4,461
2nd row$0
3rd row$0
4th row$38,690
5th row$0

Common Values

Value | Count | Frequency (%)
$0 | 6292 | 61.1%
$1,310 | 4 | < 0.1%
$4,448 | 4 | < 0.1%
$1,391 | 4 | < 0.1%
$4,188 | 4 | < 0.1%
$4,538 | 4 | < 0.1%
$4,263 | 4 | < 0.1%
$1,105 | 4 | < 0.1%
$3,960 | 3 | < 0.1%
$3,068 | 3 | < 0.1%
Other values (3535) | 3976 | 38.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0 6292
61.1%
4,448 4
 
< 0.1%
1,391 4
 
< 0.1%
4,188 4
 
< 0.1%
4,538 4
 
< 0.1%
4,263 4
 
< 0.1%
1,105 4
 
< 0.1%
1,310 4
 
< 0.1%
3,338 3
 
< 0.1%
6,985 3
 
< 0.1%
Other values (3535) 3976
38.6%

Most occurring characters

ValueCountFrequency (%)
$ 10302
27.6%
0 7636
20.4%
, 3882
 
10.4%
3 2012
 
5.4%
1 1963
 
5.3%
4 1815
 
4.9%
5 1769
 
4.7%
2 1763
 
4.7%
6 1626
 
4.4%
8 1552
 
4.2%
Other values (2) 3032
 
8.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 23168
62.0%
Currency Symbol 10302
27.6%
Other Punctuation 3882
 
10.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 7636
33.0%
3 2012
 
8.7%
1 1963
 
8.5%
4 1815
 
7.8%
5 1769
 
7.6%
2 1763
 
7.6%
6 1626
 
7.0%
8 1552
 
6.7%
7 1546
 
6.7%
9 1486
 
6.4%
Currency Symbol
ValueCountFrequency (%)
$ 10302
100.0%
Other Punctuation
ValueCountFrequency (%)
, 3882
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 37352
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
$ 10302
27.6%
0 7636
20.4%
, 3882
 
10.4%
3 2012
 
5.4%
1 1963
 
5.3%
4 1815
 
4.9%
5 1769
 
4.7%
2 1763
 
4.7%
6 1626
 
4.4%
8 1552
 
4.2%
Other values (2) 3032
 
8.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37352
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
$ 10302
27.6%
0 7636
20.4%
, 3882
 
10.4%
3 2012
 
5.4%
1 1963
 
5.3%
4 1815
 
4.9%
5 1769
 
4.7%
2 1763
 
4.7%
6 1626
 
4.4%
8 1552
 
4.2%
Other values (2) 3032
 
8.1%
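OLDCLAIM is typed as Categorical and flagged for high cardinality, yet every value is a dollar amount written as text (the "$" character occurs 10302 times, once per observation). One way to make the column numeric is to strip the currency symbol and thousands separators before converting. The sketch below is a minimal illustration; `parse_dollars` is a hypothetical helper, and it assumes whole-dollar amounts as seen in the samples.

```python
def parse_dollars(text):
    """Convert a string such as "$38,690" to an int; return None for missing input."""
    if text is None:
        return None
    cleaned = text.strip().replace("$", "").replace(",", "")
    return int(cleaned) if cleaned else None


# First five OLDCLAIM values from the Sample block of this variable.
sample = ["$4,461", "$0", "$0", "$38,690", "$0"]
print([parse_dollars(value) for value in sample])
```

After such a conversion the column could be profiled as a numeric variable, which would replace the character-level statistics with quantiles and a histogram.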

CLM_FREQ
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.80071831
Minimum: 0
Maximum: 5
Zeros: 6292
Zeros (%): 61.1%
Negative: 0
Negative (%): 0.0%
Memory size: 80.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 2
95-th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.1540786
Coefficient of variation (CV): 1.4413041
Kurtosis: 0.24591814
Mean: 0.80071831
Median Absolute Deviation (MAD): 0
Skewness: 1.1940624
Sum: 8249
Variance: 1.3318974
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Common values

Value | Count | Frequency (%)
0 | 6292 | 61.1%
2 | 1492 | 14.5%
1 | 1279 | 12.4%
3 | 992 | 9.6%
4 | 225 | 2.2%
5 | 22 | 0.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 6292 | 61.1%
1 | 1279 | 12.4%
2 | 1492 | 14.5%
3 | 992 | 9.6%
4 | 225 | 2.2%
5 | 22 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
5 | 22 | 0.2%
4 | 225 | 2.2%
3 | 992 | 9.6%
2 | 1492 | 14.5%
1 | 1279 | 12.4%
0 | 6292 | 61.1%
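Because CLM_FREQ takes only six distinct values, the full frequency table is enough to recompute the reported Sum, Mean and Variance. The sketch below does so; dividing by n - 1 (sample variance) is an assumption, chosen because it reproduces the reported figure.

```python
# CLM_FREQ value counts copied from the table above: {value: count}.
counts = {0: 6292, 1: 1279, 2: 1492, 3: 992, 4: 225, 5: 22}

n = sum(counts.values())                                # number of observations
total = sum(value * c for value, c in counts.items())   # "Sum" in the report
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (value - mean) ** 2 for value, c in counts.items()) / (n - 1)

print(n, total, round(mean, 8), round(variance, 7))
```

The same approach works for any variable whose frequency table lists every distinct value; it does not work when some values are collapsed into an "Other values" row.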

REVOKED
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.2 KiB

Value | Count | Frequency (%)
False | 9041 | 87.8%
True | 1261 | 12.2%

MVR_PTS
Real number (ℝ)

Distinct: 14
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7101534
Minimum: 0
Maximum: 13
Zeros: 4658
Zeros (%): 45.2%
Negative: 0
Negative (%): 0.0%
Memory size: 80.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 3
95-th percentile: 6
Maximum: 13
Range: 13
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.1590149
Coefficient of variation (CV): 1.2624686
Kurtosis: 1.3358371
Mean: 1.7101534
Median Absolute Deviation (MAD): 1
Skewness: 1.3405063
Sum: 17618
Variance: 4.6613453
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Common values

Value | Count | Frequency (%)
0 | 4658 | 45.2%
1 | 1467 | 14.2%
2 | 1199 | 11.6%
3 | 966 | 9.4%
4 | 727 | 7.1%
5 | 528 | 5.1%
6 | 341 | 3.3%
7 | 213 | 2.1%
8 | 114 | 1.1%
9 | 53 | 0.5%
Other values (4) | 36 | 0.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 4658 | 45.2%
1 | 1467 | 14.2%
2 | 1199 | 11.6%
3 | 966 | 9.4%
4 | 727 | 7.1%
5 | 528 | 5.1%
6 | 341 | 3.3%
7 | 213 | 2.1%
8 | 114 | 1.1%
9 | 53 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
13 | 2 | < 0.1%
12 | 1 | < 0.1%
11 | 13 | 0.1%
10 | 20 | 0.2%
9 | 53 | 0.5%
8 | 114 | 1.1%
7 | 213 | 2.1%
6 | 341 | 3.3%
5 | 528 | 5.1%
4 | 727 | 7.1%

CLM_AMT
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 2346
Distinct (%): 22.8%
Missing: 0
Missing (%): 0.0%
Memory size: 80.6 KiB

$0: 7556
$2,327: 4
$3,674: 4
$3,350: 4
$4,363: 4
Other values (2341): 2730

Length

Max length8
Median length2
Mean length3.0606678
Min length2

Characters and Unicode

Total characters31531
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1996 ?
Unique (%)19.4%

Sample

1st row$0
2nd row$0
3rd row$0
4th row$0
5th row$0

Common Values

Value | Count | Frequency (%)
$0 | 7556 | 73.3%
$2,327 | 4 | < 0.1%
$3,674 | 4 | < 0.1%
$3,350 | 4 | < 0.1%
$4,363 | 4 | < 0.1%
$3,667 | 4 | < 0.1%
$5,900 | 3 | < 0.1%
$4,506 | 3 | < 0.1%
$6,409 | 3 | < 0.1%
$5,951 | 3 | < 0.1%
Other values (2336) | 2714 | 26.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0 7556
73.3%
3,674 4
 
< 0.1%
3,350 4
 
< 0.1%
4,363 4
 
< 0.1%
3,667 4
 
< 0.1%
2,327 4
 
< 0.1%
6,879 3
 
< 0.1%
2,493 3
 
< 0.1%
1,479 3
 
< 0.1%
3,858 3
 
< 0.1%
Other values (2336) 2714
 
26.3%

Most occurring characters

ValueCountFrequency (%)
$ 10302
32.7%
0 8433
26.7%
, 2620
 
8.3%
3 1378
 
4.4%
4 1297
 
4.1%
2 1281
 
4.1%
1 1220
 
3.9%
5 1205
 
3.8%
6 1039
 
3.3%
7 946
 
3.0%
Other values (2) 1810
 
5.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 18609
59.0%
Currency Symbol 10302
32.7%
Other Punctuation 2620
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 8433
45.3%
3 1378
 
7.4%
4 1297
 
7.0%
2 1281
 
6.9%
1 1220
 
6.6%
5 1205
 
6.5%
6 1039
 
5.6%
7 946
 
5.1%
8 923
 
5.0%
9 887
 
4.8%
Currency Symbol
ValueCountFrequency (%)
$ 10302
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2620
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 31531
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
$ 10302
32.7%
0 8433
26.7%
, 2620
 
8.3%
3 1378
 
4.4%
4 1297
 
4.1%
2 1281
 
4.1%
1 1220
 
3.9%
5 1205
 
3.8%
6 1039
 
3.3%
7 946
 
3.0%
Other values (2) 1810
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31531
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
$ 10302
32.7%
0 8433
26.7%
, 2620
 
8.3%
3 1378
 
4.4%
4 1297
 
4.1%
2 1281
 
4.1%
1 1220
 
3.9%
5 1205
 
3.8%
6 1039
 
3.3%
7 946
 
3.0%
Other values (2) 1810
 
5.7%

CAR_AGE
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 639
Missing (%): 6.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.2981476
Minimum: -3
Maximum: 28
Zeros: 4
Zeros (%): < 0.1%
Negative: 1
Negative (%): < 0.1%
Memory size: 80.6 KiB

Quantile statistics

Minimum: -3
5-th percentile: 1
Q1: 1
Median: 8
Q3: 12
95-th percentile: 18
Maximum: 28
Range: 31
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 5.7144502
Coefficient of variation (CV): 0.68864167
Kurtosis: -0.76432996
Mean: 8.2981476
Median Absolute Deviation (MAD): 5
Skewness: 0.28046053
Sum: 80185
Variance: 32.654941
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
Common values

Value | Count | Frequency (%)
1 | 2489 | 24.2%
8 | 696 | 6.8%
9 | 659 | 6.4%
7 | 655 | 6.4%
10 | 600 | 5.8%
6 | 552 | 5.4%
11 | 549 | 5.3%
12 | 467 | 4.5%
13 | 450 | 4.4%
14 | 396 | 3.8%
Other values (20) | 2150 | 20.9%
(Missing) | 639 | 6.2%

Minimum 10 values

Value | Count | Frequency (%)
-3 | 1 | < 0.1%
0 | 4 | < 0.1%
1 | 2489 | 24.2%
2 | 18 | 0.2%
3 | 70 | 0.7%
4 | 169 | 1.6%
5 | 360 | 3.5%
6 | 552 | 5.4%
7 | 655 | 6.4%
8 | 696 | 6.8%

Maximum 10 values

Value | Count | Frequency (%)
28 | 1 | < 0.1%
27 | 1 | < 0.1%
26 | 3 | < 0.1%
25 | 8 | 0.1%
24 | 13 | 0.1%
23 | 22 | 0.2%
22 | 33 | 0.3%
21 | 65 | 0.6%
20 | 112 | 1.1%
19 | 155 | 1.5%
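CAR_AGE has a minimum of -3 and exactly one negative value, which cannot be a valid age and is most likely a data-entry error. One possible treatment is to turn negative ages into missing values before modelling. The sketch below illustrates that choice; clipping to zero or taking the absolute value are alternatives, and `clean_car_age` is a hypothetical helper.

```python
def clean_car_age(values):
    """Replace negative ages with None (treated as missing); keep everything else."""
    return [None if v is not None and v < 0 else v for v in values]


# Illustrative values only, not rows taken from the dataset.
ages = [18.0, 1.0, -3.0, None, 0.0]
print(clean_car_age(ages))
```

Zero is left untouched because the report lists 4 zero values, which may represent cars in their first year.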

CLAIM_FLAG
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 80.6 KiB

0: 7556
1: 2746

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters10302
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value | Count | Frequency (%)
0 | 7556 | 73.3%
1 | 2746 | 26.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 7556
73.3%
1 2746
 
26.7%

Most occurring characters

ValueCountFrequency (%)
0 7556
73.3%
1 2746
 
26.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10302
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 7556
73.3%
1 2746
 
26.7%

Most occurring scripts

ValueCountFrequency (%)
Common 10302
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 7556
73.3%
1 2746
 
26.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10302
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 7556
73.3%
1 2746
 
26.7%
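CLAIM_FLAG has 7556 zeros, the same count as the $0 entries in CLM_AMT, which suggests (but does not prove) that the flag simply marks rows with a non-zero claim amount. A row-level check along the following lines could confirm it. This is a sketch under that assumption: `inconsistent_rows` is a hypothetical helper, and CLAIM_FLAG is treated as an integer even though the report types it as Categorical.

```python
def inconsistent_rows(rows):
    """Return indices where CLAIM_FLAG disagrees with (CLM_AMT > 0)."""
    bad = []
    for i, (clm_amt, claim_flag) in enumerate(rows):
        amount = int(clm_amt.replace("$", "").replace(",", ""))
        if (amount > 0) != (claim_flag == 1):
            bad.append(i)
    return bad


# (CLM_AMT, CLAIM_FLAG) pairs from rows 0, 6 and 8 of the Sample section,
# followed by one made-up mismatch to show what a failure looks like.
rows = [("$0", 0), ("$2,946", 1), ("$6,477", 1), ("$0", 1)]
print(inconsistent_rows(rows))
```

If the check returns no indices on the real data, CLAIM_FLAG carries no information beyond CLM_AMT and one of the two should be excluded from the feature set when the other is the prediction target.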

Interactions

[Figure: pairwise interaction plots, 81 panels]

Correlations

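Variable | ID | AGE | HOMEKIDS | YOJ | TRAVTIME | TIF | CLM_FREQ | MVR_PTS | CAR_AGE | KIDSDRIV | PARENT1 | MSTATUS | GENDER | EDUCATION | OCCUPATION | CAR_USE | CAR_TYPE | RED_CAR | REVOKED | CLAIM_FLAG
ID | 1.000 | -0.013 | 0.009 | -0.005 | -0.001 | -0.012 | 0.001 | 0.010 | -0.003 | 0.018 | 0.010 | 0.014 | 0.004 | 0.021 | 0.012 | 0.000 | 0.000 | 0.022 | 0.000 | 0.000
AGE | -0.013 | 1.000 | -0.515 | 0.149 | -0.001 | -0.002 | -0.047 | -0.062 | 0.185 | 0.166 | 0.326 | 0.090 | 0.076 | 0.126 | 0.108 | 0.063 | 0.095 | 0.071 | 0.045 | 0.159
HOMEKIDS | 0.009 | -0.515 | 1.000 | 0.137 | -0.003 | 0.003 | 0.055 | 0.055 | -0.167 | 0.314 | 0.528 | 0.043 | 0.128 | 0.106 | 0.108 | 0.000 | 0.053 | 0.076 | 0.044 | 0.134
YOJ | -0.005 | 0.149 | 0.137 | 1.000 | -0.010 | 0.013 | -0.018 | -0.032 | 0.040 | 0.077 | 0.061 | 0.247 | 0.115 | 0.066 | 0.241 | 0.058 | 0.070 | 0.077 | 0.000 | 0.083
TRAVTIME | -0.001 | -0.001 | -0.003 | -0.010 | 1.000 | -0.009 | 0.012 | 0.010 | -0.033 | 0.024 | 0.029 | 0.014 | 0.000 | 0.029 | 0.041 | 0.000 | 0.000 | 0.000 | 0.000 | 0.062
TIF | -0.012 | -0.002 | 0.003 | 0.013 | -0.009 | 1.000 | -0.021 | -0.035 | 0.001 | 0.000 | 0.019 | 0.000 | 0.015 | 0.000 | 0.006 | 0.000 | 0.003 | 0.023 | 0.027 | 0.085
CLM_FREQ | 0.001 | -0.047 | 0.055 | -0.018 | 0.012 | -0.021 | 1.000 | 0.418 | -0.024 | 0.022 | 0.065 | 0.070 | 0.013 | 0.028 | 0.031 | 0.080 | 0.039 | 0.025 | 0.075 | 0.249
MVR_PTS | 0.010 | -0.062 | 0.055 | -0.032 | 0.010 | -0.035 | 0.418 | 1.000 | -0.020 | 0.030 | 0.072 | 0.046 | 0.000 | 0.026 | 0.028 | 0.066 | 0.027 | 0.011 | 0.056 | 0.223
CAR_AGE | -0.003 | 0.185 | -0.167 | 0.040 | -0.033 | 0.001 | -0.024 | -0.020 | 1.000 | 0.025 | 0.068 | 0.033 | 0.030 | 0.425 | 0.235 | 0.092 | 0.057 | 0.025 | 0.030 | 0.110
KIDSDRIV | 0.018 | 0.166 | 0.314 | 0.077 | 0.024 | 0.000 | 0.022 | 0.030 | 0.025 | 1.000 | 0.231 | 0.039 | 0.047 | 0.038 | 0.043 | 0.000 | 0.020 | 0.045 | 0.035 | 0.112
PARENT1 | 0.010 | 0.326 | 0.528 | 0.061 | 0.029 | 0.019 | 0.065 | 0.072 | 0.068 | 0.231 | 1.000 | 0.474 | 0.068 | 0.090 | 0.093 | 0.000 | 0.055 | 0.043 | 0.049 | 0.158
MSTATUS | 0.014 | 0.090 | 0.043 | 0.247 | 0.014 | 0.000 | 0.070 | 0.046 | 0.033 | 0.039 | 0.474 | 1.000 | 0.000 | 0.046 | 0.030 | 0.006 | 0.000 | 0.009 | 0.039 | 0.129
GENDER | 0.004 | 0.076 | 0.128 | 0.115 | 0.000 | 0.015 | 0.013 | 0.000 | 0.030 | 0.047 | 0.068 | 0.000 | 1.000 | 0.050 | 0.251 | 0.282 | 0.717 | 0.663 | 0.005 | 0.019
EDUCATION | 0.021 | 0.126 | 0.106 | 0.066 | 0.029 | 0.000 | 0.028 | 0.026 | 0.425 | 0.038 | 0.090 | 0.046 | 0.050 | 1.000 | 0.560 | 0.217 | 0.094 | 0.030 | 0.017 | 0.151
OCCUPATION | 0.012 | 0.108 | 0.108 | 0.241 | 0.041 | 0.006 | 0.031 | 0.028 | 0.235 | 0.043 | 0.093 | 0.030 | 0.251 | 0.560 | 1.000 | 0.573 | 0.136 | 0.173 | 0.026 | 0.188
CAR_USE | 0.000 | 0.063 | 0.000 | 0.058 | 0.000 | 0.000 | 0.080 | 0.066 | 0.092 | 0.000 | 0.000 | 0.006 | 0.282 | 0.217 | 0.573 | 1.000 | 0.539 | 0.189 | 0.006 | 0.136
CAR_TYPE | 0.000 | 0.095 | 0.053 | 0.070 | 0.000 | 0.003 | 0.039 | 0.027 | 0.057 | 0.020 | 0.055 | 0.000 | 0.717 | 0.094 | 0.136 | 0.539 | 1.000 | 0.486 | 0.033 | 0.134
RED_CAR | 0.022 | 0.071 | 0.076 | 0.077 | 0.000 | 0.023 | 0.025 | 0.011 | 0.025 | 0.045 | 0.043 | 0.009 | 0.663 | 0.030 | 0.173 | 0.189 | 0.486 | 1.000 | 0.000 | 0.000
REVOKED | 0.000 | 0.045 | 0.044 | 0.000 | 0.000 | 0.027 | 0.075 | 0.056 | 0.030 | 0.035 | 0.049 | 0.039 | 0.005 | 0.017 | 0.026 | 0.006 | 0.033 | 0.000 | 1.000 | 0.155
CLAIM_FLAG | 0.000 | 0.159 | 0.134 | 0.083 | 0.062 | 0.085 | 0.249 | 0.223 | 0.110 | 0.112 | 0.158 | 0.129 | 0.019 | 0.151 | 0.188 | 0.136 | 0.134 | 0.000 | 0.155 | 1.000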
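The strongest relationships in the matrix above can be picked out by filtering on the absolute coefficient. The sketch below uses a cut-off of 0.5, which is an assumption: the report does not state its threshold, but 0.5 reproduces the pairs listed under the High correlation alerts in the Overview. Only a handful of coefficients are copied in, and `high_pairs` is a hypothetical helper.

```python
# A few coefficients copied from the correlation matrix (upper triangle only).
coefficients = {
    ("AGE", "HOMEKIDS"): -0.515,
    ("HOMEKIDS", "PARENT1"): 0.528,
    ("PARENT1", "MSTATUS"): 0.474,
    ("GENDER", "CAR_TYPE"): 0.717,
    ("GENDER", "RED_CAR"): 0.663,
    ("EDUCATION", "OCCUPATION"): 0.560,
    ("OCCUPATION", "CAR_USE"): 0.573,
    ("CAR_USE", "CAR_TYPE"): 0.539,
    ("CAR_TYPE", "RED_CAR"): 0.486,
    ("CLM_FREQ", "MVR_PTS"): 0.418,
}

THRESHOLD = 0.5  # assumed cut-off, not stated in the report


def high_pairs(coeffs, threshold=THRESHOLD):
    """Return pairs whose absolute coefficient reaches the threshold, strongest first."""
    hits = [(pair, r) for pair, r in coeffs.items() if abs(r) >= threshold]
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)


for (a, b), r in high_pairs(coefficients):
    print(f"{a:10s} {b:10s} {r:+.3f}")
```

Pairs just under the cut-off, such as CAR_TYPE and RED_CAR (0.486) or PARENT1 and MSTATUS (0.474), are not flagged but may still matter when selecting features.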

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
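Nullity correlation can be read as an ordinary Pearson correlation computed on "is missing" indicators instead of on the values themselves. The sketch below shows the idea on toy data; it is not the report's own implementation, `nullity_correlation` is a hypothetical helper, and missing values are represented as None.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # undefined when a column is fully present or fully missing
    return cov / sqrt(var_a * var_b)


# Toy columns, not taken from the dataset.
income = [67349, None, 52881, None]
home_val = [0, None, 124191, None]
yoj = [None, 11.0, None, 10.0]

print(nullity_correlation(income, home_val))  # missing in the same rows
print(nullity_correlation(income, yoj))       # missing in opposite rows
```

A value near 1 means two columns tend to be missing together, a value near -1 means one is present when the other is missing, and a value near 0 means the gaps are unrelated.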

Sample

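First rows

Index | ID | KIDSDRIV | AGE | HOMEKIDS | YOJ | INCOME | PARENT1 | HOME_VAL | MSTATUS | GENDER | EDUCATION | OCCUPATION | TRAVTIME | CAR_USE | BLUEBOOK | TIF | CAR_TYPE | RED_CAR | OLDCLAIM | CLM_FREQ | REVOKED | MVR_PTS | CLM_AMT | CAR_AGE | CLAIM_FLAG
0 | 63581743 | 0 | 60.0 | 0 | 11.0 | $67,349 | No | $0 | z_No | M | PhD | Professional | 14 | Private | $14,230 | 11 | Minivan | yes | $4,461 | 2 | No | 3 | $0 | 18.0 | 0
1 | 132761049 | 0 | 43.0 | 0 | 11.0 | $91,449 | No | $257,252 | z_No | M | z_High School | z_Blue Collar | 22 | Commercial | $14,940 | 1 | Minivan | yes | $0 | 0 | No | 0 | $0 | 1.0 | 0
2 | 921317019 | 0 | 48.0 | 0 | 11.0 | $52,881 | No | $0 | z_No | M | Bachelors | Manager | 26 | Private | $21,970 | 1 | Van | yes | $0 | 0 | No | 2 | $0 | 10.0 | 0
3 | 727598473 | 0 | 35.0 | 1 | 10.0 | $16,039 | No | $124,191 | Yes | z_F | z_High School | Clerical | 5 | Private | $4,010 | 4 | z_SUV | no | $38,690 | 2 | No | 3 | $0 | 10.0 | 0
4 | 450221861 | 0 | 51.0 | 0 | 14.0 | NaN | No | $306,251 | Yes | M | <High School | z_Blue Collar | 32 | Private | $15,440 | 7 | Minivan | yes | $0 | 0 | No | 0 | $0 | 6.0 | 0
5 | 743146596 | 0 | 50.0 | 0 | NaN | $114,986 | No | $243,925 | Yes | z_F | PhD | Doctor | 36 | Private | $18,000 | 1 | z_SUV | no | $19,217 | 2 | Yes | 3 | $0 | 17.0 | 0
6 | 871024631 | 0 | 34.0 | 1 | 12.0 | $125,301 | Yes | $0 | z_No | z_F | Bachelors | z_Blue Collar | 46 | Commercial | $17,430 | 1 | Sports Car | no | $0 | 0 | No | 0 | $2,946 | 7.0 | 1
7 | 792300541 | 0 | 54.0 | 0 | NaN | $18,755 | No | NaN | Yes | z_F | <High School | z_Blue Collar | 33 | Private | $8,780 | 1 | z_SUV | no | $0 | 0 | No | 0 | $0 | 1.0 | 0
8 | 7945239 | 1 | 40.0 | 1 | 11.0 | $50,815 | Yes | $0 | z_No | M | z_High School | Manager | 21 | Private | $18,930 | 6 | Minivan | no | $3,295 | 1 | No | 2 | $6,477 | 1.0 | 1
9 | 3577610 | 0 | 44.0 | 2 | 12.0 | $43,486 | Yes | $0 | z_No | z_F | z_High School | z_Blue Collar | 30 | Commercial | $5,900 | 10 | z_SUV | no | $0 | 0 | No | 0 | $0 | 10.0 | 0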
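
Last rows

Index | ID | KIDSDRIV | AGE | HOMEKIDS | YOJ | INCOME | PARENT1 | HOME_VAL | MSTATUS | GENDER | EDUCATION | OCCUPATION | TRAVTIME | CAR_USE | BLUEBOOK | TIF | CAR_TYPE | RED_CAR | OLDCLAIM | CLM_FREQ | REVOKED | MVR_PTS | CLM_AMT | CAR_AGE | CLAIM_FLAG
10292 | 452807843 | 0 | 48.0 | 0 | 10.0 | $111,305 | No | $0 | z_No | z_F | PhD | Doctor | 59 | Private | $17,430 | 13 | z_SUV | no | $0 | 0 | No | 4 | $0 | 18.0 | 0
10293 | 814422920 | 0 | 51.0 | 0 | 10.0 | $128,523 | No | $0 | z_No | M | Masters | NaN | 18 | Commercial | $32,960 | 6 | Panel Truck | no | $3,995 | 3 | No | 1 | $3,288 | 15.0 | 1
10294 | 721196389 | 1 | 38.0 | 4 | 16.0 | $12,717 | No | $0 | Yes | z_F | Bachelors | Student | 15 | Commercial | $24,740 | 1 | Pickup | no | $9,245 | 3 | No | 3 | $0 | 15.0 | 0
10295 | 215633551 | 0 | 41.0 | 0 | 7.0 | $6,256 | No | $0 | z_No | M | z_High School | Student | 41 | Private | $5,600 | 1 | Pickup | no | $0 | 0 | No | 0 | $0 | 7.0 | 0
10296 | 121441578 | 0 | 35.0 | 0 | 11.0 | $43,112 | No | $0 | z_No | M | z_High School | z_Blue Collar | 51 | Commercial | $27,330 | 10 | Panel Truck | yes | $0 | 0 | No | 0 | $0 | 8.0 | 0
10297 | 67790126 | 1 | 45.0 | 2 | 9.0 | $164,669 | No | $386,273 | Yes | M | PhD | Manager | 21 | Private | $13,270 | 15 | Minivan | no | $0 | 0 | No | 2 | $0 | 17.0 | 0
10298 | 61970712 | 0 | 46.0 | 0 | 9.0 | $107,204 | No | $332,591 | Yes | M | Masters | NaN | 36 | Commercial | $24,490 | 6 | Panel Truck | no | $0 | 0 | No | 0 | $0 | 1.0 | 0
10299 | 849208064 | 0 | 48.0 | 0 | 15.0 | $39,837 | No | $170,611 | Yes | z_F | <High School | z_Blue Collar | 12 | Private | $13,820 | 7 | z_SUV | no | $0 | 0 | No | 0 | $0 | 1.0 | 0
10300 | 627828331 | 0 | 50.0 | 0 | 7.0 | $43,445 | No | $149,248 | Yes | z_F | Bachelors | Home Maker | 36 | Private | $22,550 | 6 | Minivan | no | $0 | 0 | No | 0 | $0 | 11.0 | 0
10301 | 680381960 | 0 | 52.0 | 0 | 11.0 | $53,235 | No | $197,017 | Yes | z_F | z_High School | Clerical | 64 | Private | $19,400 | 6 | Minivan | no | $0 | 0 | No | 0 | $0 | 9.0 | 0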

Duplicate rows

Most frequently occurring

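Index | ID | KIDSDRIV | AGE | HOMEKIDS | YOJ | INCOME | PARENT1 | HOME_VAL | MSTATUS | GENDER | EDUCATION | OCCUPATION | TRAVTIME | CAR_USE | BLUEBOOK | TIF | CAR_TYPE | RED_CAR | OLDCLAIM | CLM_FREQ | REVOKED | MVR_PTS | CLM_AMT | CAR_AGE | CLAIM_FLAG | # duplicates
0 | 279799481 | 0 | 39.0 | 0 | 14.0 | $93,077 | No | $244,764 | Yes | M | Bachelors | Professional | 29 | Private | $14,710 | 1 | Minivan | yes | $0 | 0 | No | 0 | $0 | 1.0 | 0 | 2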
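A duplicate row is one whose values match another row in every column, so duplicates can be found by counting identical rows. The sketch below shows the idea on a few shortened rows; `duplicate_rows` is a hypothetical helper and the three-column tuples are a simplification of the 25-column records.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Shortened (ID, AGE, CAR_TYPE) tuples; the first and third are identical.
rows = [
    (279799481, 39.0, "Minivan"),
    (63581743, 60.0, "Minivan"),
    (279799481, 39.0, "Minivan"),
]
print(duplicate_rows(rows))
```

Rows containing floating-point NaN values need extra care with this approach, because NaN does not compare equal to itself; representing missing values as None avoids the problem.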